Protein Fold Class Prediction: New Methods of Statistical Classification

Authors

  • Janet Grassmann
  • Martin Reczko
  • Sándor Suhai
  • Lutz Edler
Abstract

Feed-forward neural networks are compared with standard and new statistical classification procedures for the classification of proteins. From the methods based on a posteriori probabilities, we applied logistic regression, an additive model, and projection pursuit regression; from the methods based on class-conditional probabilities, linear, quadratic, and flexible discriminant analysis; and, in addition, the K-nearest-neighbors classification rule. The apparent error rate obtained with the training sample (n = 143), the test error rate obtained with the test sample (n = 125), and the 10-fold cross-validation error were calculated. We conclude that some of the standard statistical methods are potent competitors to the more flexible tools of machine learning.
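The three error estimates named in the abstract can be illustrated with a minimal sketch. It uses a plain K-nearest-neighbors rule on synthetic two-class data; the data generator, the function names, the choice K = 3, and running the cross-validation on the training sample are all assumptions made for illustration and are not taken from the paper. Only the sample sizes (143 and 125) come from the abstract.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def error_rate(train_X, train_y, eval_X, eval_y, k=3):
    """Fraction of evaluation points misclassified by k-NN built on the training data."""
    wrong = sum(
        knn_predict(train_X, train_y, x, k) != y for x, y in zip(eval_X, eval_y)
    )
    return wrong / len(eval_y)


def cross_validation_error(X, y, folds=10, k=3, seed=0):
    """Hold out each fold once, classify it from the remaining folds, pool the errors."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    wrong = 0
    for f in range(folds):
        held_out = idx[f::folds]
        held_set = set(held_out)
        fit = [i for i in idx if i not in held_set]
        fit_X = [X[i] for i in fit]
        fit_y = [y[i] for i in fit]
        wrong += sum(knn_predict(fit_X, fit_y, X[i], k) != y[i] for i in held_out)
    return wrong / len(y)


def make_sample(n, rng):
    """Hypothetical synthetic data: two overlapping Gaussian classes in 2-D."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(42)
train_X, train_y = make_sample(143, rng)  # training sample size from the abstract
test_X, test_y = make_sample(125, rng)    # test sample size from the abstract

# Apparent error: the rule is evaluated on the same data it was built from.
apparent = error_rate(train_X, train_y, train_X, train_y)
# Test error: the rule is evaluated on an independent sample.
test_err = error_rate(train_X, train_y, test_X, test_y)
# 10-fold cross-validation error, here computed on the training sample (an assumption).
cv_err = cross_validation_error(train_X, train_y)

print(f"apparent={apparent:.3f} test={test_err:.3f} cv10={cv_err:.3f}")
```

The apparent error is usually optimistic because every point is scored by a rule that has already seen it; with K = 1 each training point is its own nearest neighbor, so the apparent error is zero no matter how well the rule generalizes. That is why the test and cross-validation errors are reported alongside it.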

Similar resources

Multi-class protein fold recognition using support vector machines and neural networks

MOTIVATION: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We study this approach with new multi-class classification methods and examine many issues important for a practical recognition system. RESULTS: Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known '...

Support Vector Machines for Protein Fold Class Prediction

Knowledge of the three-dimensional structure of a protein is essential for describing and understanding its function. Today, a large number of known protein sequences faces a small number of identified structures. Thus, the need arises to predict structure from sequence without using time-consuming experimental identification. In this paper the performance of Support Vector Machines (SVMs) is c...

Enhancing Protein Fold Prediction Accuracy Using an Ensemble of Different Classifiers

The protein fold prediction problem is considered key to protein structure recognition and structural discovery. Recent advances in the pattern recognition field have brought great interest in applying pattern classification techniques to this problem. From the pattern recognition point of view, the protein fold prediction problem can be expressed as a multi-class classification task that...

Multi-class Protein Fold Recognition Through a Symbolic-Statistical Framework

Protein fold recognition is an important problem in molecular biology. Symbolic machine learning approaches have been applied to automatically discover local structural signatures and relate these to the concept of fold in SCOP. However, most of these methods cannot handle uncertainty and are therefore unable to solve multiple prediction problems. In this paper we present an application of the ...

Protein Secondary Structure Prediction Using Support Vector Machines and a New Feature Representation

Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in the prediction of fold, and eventually the tertiary structure of the protein. A challenging issue of predicting protein secondary structure from sequence alone is addressed. Support vector machines (SVM) are employed for the classification and the SVM outputs are converted to posterior probabilitie...

Journal title:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Publication date: 1999